Goodness-of-fit Statistics and CMB Data Sets
Abstract. Application of a Goodness-of-fit (GOF) statistic is an essential element of parameter estimation.
We discuss the computation of GOF when estimating parameters from anisotropy measurements of the cosmic 
microwave background (CMB), and we propose two GOF statistics to be used when employing approximate band- 
power likelihood functions. They are based on an approximate form for the distribution of band-power estimators 
that requires only minimal experimental information to construct. Monte Carlo simulations of CMB experiments 
show that the proposed form describes the true distributions quite well. We apply these GOF statistics to current 
CMB anisotropy data and discuss the results. 
1. Introduction



Measurement of the cosmic microwave background (CMB)
temperature anisotropies has proven to be one of the most
powerful tools for estimating important cosmological
parameters (Netterfield et al. 2002; Pryke et al. 2002;
Rubino-Martin et al. 2002; Sievers et al. 2002;
Wang et al. 2002). The observed angular power spectrum
shows the coherent peak structure expected in inflationary
models, and fitting model curves to the data 1
yields constraints on many parameters. This leads in
particular to the conclusion that the geometry of space
is flat (Lineweaver et al. 1997; de Bernardis et al. 2000;
Hanany et al. 2000; Lange et al. 2000; Balbi et al. 2000).
In terms of statistics, the procedure just described is one
of parameter estimation.

Parameter estimation proceeds via the identification of 
a best model (set of parameters) within a family of mod- 
els, an evaluation of the quality of the fit and the con- 
struction of parameter constraints. The method of max- 
imum likelihood (ML), for example, is a useful, general 
procedure for finding a best-fit model. As a general rule, 
one must judge the quality of the fit before any serious 
consideration of parameter constraints. This requires the 
application of a Goodness-of-fit (GOF) statistic. Such a 
statistic is, usually, some scalar function of the data whose 



Send offprint requests to: douspis@astro.ox.ac.uk 

1 See http://webast.ast.obs-mip.fr/cosmo/CMB for an 
up-to-date compilation. 



distribution may be calculated once given an underlying 
physical model and a model of the statistical fluctuations 
in the data. It is generally a function gof(d,T) of both 
the data d and theory T, such that gof attains, for ex- 
ample, a minimum when d is generated by the theory T. 
It is defined in a 'monotonic' way, in the sense that gof 
becomes larger as d gets 'further' from a realization of T . 
The 'significance' may then be defined as the probability of 
obtaining $gof > gof_{obs}$. On this basis, it permits a quantitative
evaluation of the quality of the best model's fit to
the data: if the probability of obtaining the observed value 
of the GOF statistic (from the actual data set) is low (low 
significance), then the model should be rejected. Without 
such a statistic, one does not know if the best model is a 
good model, or simply the "least bad" of the family. 
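
The definition of significance given above can be illustrated with a minimal Python sketch. It assumes a toy GOF statistic (a sum of squared unit-variance Gaussian residuals) whose null distribution is sampled by Monte Carlo; the function names and all numbers are hypothetical and are not taken from the analysis of this paper.

```python
import random

def simulate_gof(n_data, rng):
    """Toy GOF under the null hypothesis: sum of squared unit-variance Gaussian residuals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_data))

def significance(gof_obs, gof_samples):
    """Estimate Prob(gof > gof_obs) as the fraction of simulated values above gof_obs."""
    return sum(1 for g in gof_samples if g > gof_obs) / len(gof_samples)

rng = random.Random(0)                      # fixed seed for reproducibility
samples = [simulate_gof(10, rng) for _ in range(20000)]
p_typical = significance(10.0, samples)     # observed value near the expected mean
p_extreme = significance(40.0, samples)     # observed value far in the tail
print(p_typical, p_extreme)
```

A low value such as `p_extreme` is what would lead one to reject the model; a value such as `p_typical` gives no reason to do so.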

In this paper, we examine in some detail the issue of 
GOF when analysing anisotropy data on the cosmic mi- 
crowave background. The vast majority of present analy- 
ses of the power spectrum data do not include proper GOF 
evaluations. The problem is particularly complicated by 
the fact that approximate likelihood methods must be employed
in order to process the large volume of data and to
explore a significant part of parameter space. These methods
usually rely on power spectrum estimates, such as flat
band-powers, extracted either from scan data, or from
reconstructed sky maps. Because the power is quadratic in
the temperature fluctuations, it is clear that these estimates
are not Gaussian distributed. The traditional approach
of $\chi^2$ minimisation incorrectly assumes that power
estimates are Gaussian distributed, something that can
lead to a bias in determining the best model (e.g., Douspis
et al. 2001a). For the same reason, the value of the reduced
$\chi^2$ at the best model does not retain its usual statistical
meaning and may therefore not be simply used as a GOF
statistic.

Approximations to the band-power likelihood func- 
tion that permit more rigorous analyses have been pro- 
posed (Bond, Jaffe & Knox 2000; Bartlett et al. 2001). 
The question remains, however, of how to correctly evalu- 
ate the GOF of the best model. Such an evaluation re- 
quires knowledge of the distribution of the power esti- 
mates, which is not necessarily the same as the likelihood 
function. Using the same approach as Bartlett et al. (2001; 
hereafter paper 1), we propose an ansatz for the distribu- 
tion of band-power estimates and test it against Monte 
Carlo simulations of certain MAX and Saskatoon data 
sets. The ansatz requires only minimal experimental in- 
formation, and it appears to work well. We therefore use 
it to construct two GOF statistics, which we then apply 
to various ensembles of the present CMB data set. 
2. Likelihood Method

It is useful to begin with a discussion of GOF in the con- 
text of a complete likelihood analysis. Although computa- 
tionally challenging (in fact, impossible for large data sets: 
Bond et al. 2000; Borrill 1999ab), a likelihood approach 
is conceptually straightforward and our discussion serves 
to highlight certain important points. Such an analysis is 
in any case required for a small subset of data in order to 
test approximate methods (see, for example, Douspis et 
al. 2001a, hereafter paper 2). 

Following the notation of papers 1 and 2, we write the
likelihood function as (we consider only Gaussian perturbations)

$$\mathcal{L}(\Theta) \equiv \mathrm{Prob}(\vec{d}\,|\,\Theta) = \frac{1}{(2\pi)^{N_{pix}/2}\,|C|^{1/2}}\; e^{-\frac{1}{2}\,\vec{d}^{\,t} \cdot C^{-1} \cdot \vec{d}} \qquad (1)$$

where $C(\Theta)$ is the correlation matrix (a function of the
model parameters $\Theta$ and including a contribution from
instrumental noise), and $\vec{d}$ is a column vector listing the pixel
values 2. The elements of $\Theta$ may be either the cosmological
parameters, or a set of band-powers. Maximising the
likelihood function over the parameters defines the "best
model" corresponding to the parameters $\Theta_{best}$.

In the present situation, we are greatly aided by the
Gaussian form of Eq. (1) in the data vector, $\vec{d}$. Given the
best model, the most obvious GOF statistic is then clearly

$$gof \equiv \vec{d}^{\,t} \cdot C^{-1} \cdot \vec{d} \qquad (2)$$

where $C = C(\Theta_{best})$ is the correlation matrix evaluated
at the best model. For the Gaussian fluctuations we have
assumed, this quantity follows a $\chi^2$ distribution, with

2 These 'pixels' may either be the simple pixels of a map, or 
temperature differences, as given by, for example, MAX. 



Fig. 1. Power spectrum plot of some actual CMB data 



a number of degrees-of-freedom (DOF) approximately 
equal to the number of pixels minus the number of pa- 
rameters 3 . 
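
As an illustration of Eq. (2), the quadratic form can be evaluated without inverting the correlation matrix explicitly. The sketch below is a minimal pure-Python version that assumes a small, symmetric positive-definite matrix; it uses a Cholesky factorisation (our choice for the sketch, not a method stated in the text), and the two-pixel numbers are hypothetical.

```python
import math

def cholesky(c):
    """Lower-triangular L with L L^T = c, for a symmetric positive-definite matrix."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(c[i][i] - s)
            else:
                low[i][j] = (c[i][j] - s) / low[j][j]
    return low

def forward_solve(low, d):
    """Solve L y = d for y by forward substitution."""
    y = []
    for i, row in enumerate(low):
        y.append((d[i] - sum(row[k] * y[k] for k in range(i))) / row[i])
    return y

def pixel_gof(d, c):
    """gof = d^T C^{-1} d, computed as |y|^2 with L y = d."""
    y = forward_solve(cholesky(c), d)
    return sum(v * v for v in y)

# Hypothetical two-pixel example; the numbers are for illustration only.
c_best = [[2.0, 1.0], [1.0, 2.0]]
d_obs = [1.0, 1.0]
gof = pixel_gof(d_obs, c_best)   # equals 2/3 for these inputs
```

For a real data set the resulting value would then be compared with a $\chi^2$ distribution with the number of DOF discussed above.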



3. $\chi^2$ method

For a variety of reasons (e.g., increased computational
speed or inaccessible pixel data) most parameter estimations
use power estimates, $\delta T^2$, as their starting point,
such as those shown in Figure 1. A classic minimisation of
$\chi^2$,

$$\chi^2(\Theta) = \sum_{n=1}^{N_{band}} \left[ \frac{\delta T_n^{obs} - \delta T_n(\Theta)}{\sigma_n} \right]^2 \qquad (3)$$

is commonly used to find $\Theta_{best}$ and the best model, where
$\sigma_n = \sigma_+$ ($\sigma_-$) if the model passes above (below) the data
point. The obvious GOF statistic would then be the value
of the $\chi^2$ evaluated at the minimum: $gof \equiv \chi^2(\Theta_{best})$. As
already noted, this whole procedure is inappropriate be- 
cause power estimates do not follow a Gaussian distribu- 
tion. It is of course true that if the number of contributing 
effective degrees-of-freedom 4 is large, a power estimate 
will closely follow a Gaussian; this, however, is never the 
case on the largest scales probed by a survey. We shall 



3 This recipe does not strictly apply in the present case,
because the parameters are non-linear functions of the data; it
is nevertheless standard practice. In any case, the number of
pixels is in practice much larger than the number of parameters.

4 less than the number of original pixels by a factor depend- 
ing on the pixel-pixel correlations; see paper 1 
see in the following that, for actual CMB data, the $\chi^2$ approach
leads to quantitatively different results than other,
more appropriate GOF statistics. For future reference, we
show the value of this classic $\chi^2$ in Table 1.
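
The classic statistic of Eq. (3), including the choice between upper and lower error bars, can be sketched in a few lines of Python. The function name and the band-power values below are hypothetical and serve only to show the asymmetric-error rule.

```python
def classic_chi2(obs, model, sigma_plus, sigma_minus):
    """Eq. (3) with sigma_n = sigma_+ if the model lies above the data point, else sigma_-."""
    total = 0.0
    for o, m, s_plus, s_minus in zip(obs, model, sigma_plus, sigma_minus):
        sigma = s_plus if m > o else s_minus
        total += ((o - m) / sigma) ** 2
    return total

# Hypothetical band-powers in micro-Kelvin (illustrative values only).
obs = [50.0, 60.0]
model = [55.0, 54.0]
sigma_plus = [10.0, 5.0]
sigma_minus = [5.0, 3.0]
chi2 = classic_chi2(obs, model, sigma_plus, sigma_minus)   # 0.25 + 4.0 = 4.25
```

The first point uses its upper error (model above the data), the second its lower error (model below the data).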

4. Proposed approximation 

To improve the $\chi^2$ analysis, several authors have proposed
approximations to the band-power likelihood function
$\mathcal{L}(\delta T)$ that may be constructed based on only minimal
information about the experimental set-up (Bond,
Jaffe & Knox 2000; Paper 1). One then arrives at the
likelihood as a function of cosmological parameters $\Theta$
with $\mathcal{L}[\delta T(\Theta)]$. Unfortunately, these approximate likelihood
functions do not retain the normalisation of the full
likelihood over pixels (Eq. 1). This is a crucial point for
GOF: we cannot deduce the quantity in Eq. (2) from the
value of the approximate likelihood at its maximum.

An alternative way to build a GOF statistic would be 
from the expected distribution of power estimates, i.e., 
the distribution of points in Figure 1 around the model
curve. Testing the observed dispersion of actual power 
points around the best-fit model against this expectation 
amounts to a GOF. The main difficulty in this approach 
is that we do not have an expression for the distribu- 
tion of ML power estimates. It is important to understand 
that this distribution is not the same as the band-power 
likelihood, whose maximum is used to find the estimated 
power. In this section, we first motivate and then test an 
approximation to the distribution of ML power estimates. 

4.1. Motivating an ansatz 

Our approach will be the same as in Paper 1, and the 
following results thus apply when using the approximate 
band-power likelihood introduced therein. We motivated 
our likelihood approximation with an unrealistically simplified
situation of $N_{pix}$ uncorrelated pixels and uniform
noise (referred to hereafter as the simple picture). This suggested
a functional form depending on two parameters,
an effective number of degrees-of-freedom $\nu$ and a noise
parameter $\beta$; in the simple picture, $\nu = N_{pix}$ and $\beta^2$ is
the noise variance. These two parameters could be found 
in realistic situations by adjusting to published flat-band 
confidence intervals ("errors"). The particular advantage 
of such a technique is that it permits an approximate like- 
lihood analysis based on rather rudimentary information 
often found in the literature; this is an important advan- 
tage for many first generation experiments. In this same 
spirit, we now propose an ansatz for the ML band-power 
estimators. 

For the simple picture ($\nu = N_{pix}$ and $\beta^2$ = noise variance),
we showed in Paper 1 that the ML band-power
estimator, $[\delta T]^2$, was a linear transform of a $\chi^2_\nu$ random
variable:

$$Y\left[\delta T^2\right] \equiv \nu\, \frac{[\delta T]^2 + \beta^2}{[\delta T(\Theta)]^2 + \beta^2} \qquad (4)$$

where $\delta T^2(\Theta)$ is the band-power of the underlying model.
In a realistic situation where $\nu$ and $\beta$ are found from published
power estimates, there is no a priori guarantee that
this formula applies with the same values of $\nu$ and $\beta$. One
is, of course, tempted to suppose that the same values
may in fact be used, at least approximately. This hope
forms the basis of our proposed ansatz for the band-power
estimator distribution:

$$\mathcal{P}(\delta T^2 \,|\, \Theta) \propto Y^{(\nu/2 - 1)}\, e^{-Y/2}, \qquad Y\left[\delta T^2\right] = \nu\, \frac{[\delta T]^2 + \beta^2}{[\delta T(\Theta)]^2 + \beta^2} \qquad (5)$$



The underlying model band-power $\delta T^2(\Theta)$ is in practice
taken to be the ML estimate. The essential spirit of our
approach is that, knowing the flat-band estimates and the
68 and 95% confidence levels, one is able to reconstruct
the entire likelihood function and (now) the probability
distribution of the estimate $[\delta T]^2$.
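
To make the ansatz concrete, the following Python sketch evaluates the normalised density implied by Eqs. (4) and (5), treating $Y$ as a $\chi^2$ variable with $\nu$ DOF and including the Jacobian of the change of variable. The values of `nu` and `beta` are hypothetical; 57.3 is borrowed from the MAX best-model value quoted in the text purely as a convenient scale.

```python
import math

def chi2_pdf(y, nu):
    """Normalised chi-square density with nu degrees of freedom."""
    if y <= 0.0:
        return 0.0
    log_pdf = (nu / 2 - 1) * math.log(y) - y / 2 - (nu / 2) * math.log(2) - math.lgamma(nu / 2)
    return math.exp(log_pdf)

def estimator_pdf(dt2, dt2_model, nu, beta):
    """Density of the band-power estimate dt2 under the ansatz of Eq. (5)."""
    scale = (dt2_model + beta ** 2) / nu      # d(dt2)/dY
    y = (dt2 + beta ** 2) / scale             # Y of Eq. (4)
    return chi2_pdf(y, nu) / scale

def trapezoid(f, a, b, n):
    """Plain trapezoidal rule with n intervals."""
    h = (b - a) / n
    return h * (0.5 * f(a) + sum(f(a + i * h) for i in range(1, n)) + 0.5 * f(b))

# Hypothetical values of nu and beta; the estimate is bounded below by -beta**2.
nu, beta, dt2_model = 8.0, 30.0, 57.3 ** 2
lo, hi = -beta ** 2, 60000.0
norm = trapezoid(lambda x: estimator_pdf(x, dt2_model, nu, beta), lo, hi, 20000)
mean = trapezoid(lambda x: x * estimator_pdf(x, dt2_model, nu, beta), lo, hi, 20000)
```

Under these assumptions the density integrates to unity and its mean equals the model band-power, since the mean of a $\chi^2_\nu$ variable is $\nu$.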

The only way to be sure that this proposed method
actually works is by testing it against Monte Carlo simulations
of some experiments before generalising it. We mention
at least one reason for caution: the quantity $\nu$ represents
an effective number of DOF, reduced from $N_{pix}$ by
inter-pixel correlations, applicable to the likelihood function;
it is not at all clear that this same effective DOF
applies equally well to the power estimator distribution
(as it does in the simple picture). In particular, note that
since the same data were used to find the best-fit model,
we might expect a reduction in DOF, something familiar
from the classic reduced $\chi^2$ test. Here, however, we have
no clear idea of the reduction. Fortunately, the proposed
method nevertheless appears valid, as the following Monte
Carlo simulations demonstrate.

4.2. Testing the ansatz 

We simulated many different data realizations of the MAX
ID (Clapp et al. 1994) and Saskatoon (Netterfield et al.
1996) experiments in order to reconstruct the corresponding
ML power estimator distribution. For example, we ran
30000 realizations of MAX ID at a frequency of 3.5 cm$^{-1}$
in the following manner: we first computed the flat band-power
and the one-dimensional likelihood function for the
actual observational data. Knowledge of the latter provided
the value of the pair ($\nu$, $\beta$). The maximum of the
likelihood function gave us the "best model", which was
used to simulate pixels on the sky. In order to take into
account all correlations, we simulated our pixels using the
full pixel-pixel correlation matrix. We first computed the
theoretical part of the correlation matrix evaluated for
our "best model". After diagonalization, we drew 30000
realisations of 21 pseudo-pixels from a Gaussian distribution
centered on zero and with the variances given by the
eigenvalues. We reconstructed the "true" sky pixels using
the transformation matrix (eigenvector matrix) and
adding realizations of Gaussian noise (given by the known
noise correlation matrix). We thus obtained 30000 sets
of 21 pixels, correlated and drawn according to the best
model ($\delta T_{fb} = 57.3\,\mu$K). For each realization, we derived
the ML power estimate and built a histogram of its distribution.

Fig. 2. Distribution of the ML flat band-power estimator
for MAX ID 3.5 cm$^{-1}$ found by Monte Carlo simulation.
The smooth (red) curve is the approximation Eq. (5),
which fits the distribution well.

Figure 2 shows the resulting distribution for
MAX ID 3.5 cm$^{-1}$. Overplotted in red as the smooth
curve is the ansatz Eq. (5) with the same values of ($\nu$, $\beta$)
as found from the likelihood function. We see that the
proposed approximate distribution is indeed a good representation
of the true power estimator distribution.
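
A stripped-down version of such a test can be sketched in Python for the simple picture only, that is, for uncorrelated pixels with uniform noise, where no diagonalisation of a correlation matrix is needed. This is not the correlated MAX or Saskatoon simulation described above; the noise level `beta = 30` and the function names are hypothetical.

```python
import random

def ml_band_power(pixels, beta):
    """ML band-power estimate for uncorrelated pixels with known noise variance beta**2."""
    return sum(p * p for p in pixels) / len(pixels) - beta ** 2

def simulate_y(n_pix, dt_model, beta, n_real, rng):
    """Draw n_real realisations and return the variable Y of Eq. (4) for each."""
    sigma_tot = (dt_model ** 2 + beta ** 2) ** 0.5    # signal + noise rms per pixel
    ys = []
    for _ in range(n_real):
        pixels = [rng.gauss(0.0, sigma_tot) for _ in range(n_pix)]
        dt2_hat = ml_band_power(pixels, beta)
        ys.append(n_pix * (dt2_hat + beta ** 2) / (dt_model ** 2 + beta ** 2))
    return ys

rng = random.Random(1)                      # fixed seed for reproducibility
n_pix = 21                                  # same number of pixels as in the text
ys = simulate_y(n_pix, 57.3, 30.0, 5000, rng)
mean_y = sum(ys) / len(ys)
var_y = sum((y - mean_y) ** 2 for y in ys) / (len(ys) - 1)
# A chi-square variable with nu = n_pix DOF has mean nu and variance 2 * nu.
```

In this idealised case the simulated $Y$ values should be consistent with a $\chi^2$ law with $\nu = N_{pix}$ DOF; the interest of the full simulations is to check how well this survives pixel correlations.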

The same kind of analysis was performed for the 
Saskatoon K-band 3-point data, an altogether different 
observing strategy. Once again, the approximation fitted 
the distribution to high accuracy. On the basis of these 
agreements, we will now adopt the proposed form in Eq.
(5) as a good representation of the distribution of ML
band-power estimators. 

4.3. From probability function to GOF 

On the basis of the distribution Eq. (5), we now construct
two GOF statistics. The goal is to define a scalar quantity,
gof, that measures the scatter of points around a given
model and whose distribution is known under the hypothesis
that this model represents the "truth" (the null
hypothesis). An improbable value of gof would indicate that
there is a problem.

Both constructions assume that the band-powers are
independent. This of course is not strictly true, but generally
speaking published band-powers do not have strong
statistical correlations; for example, the residual correlation
between the Saskatoon bands is at a level of ~ 10%.
Calibration errors, on the other hand, do induce important
band-band correlations. As already mentioned, the
present work does not include calibration errors, and any
"bad fit" indicated by our GOF tests could indicate either
a false model, or that calibration errors are important. Our
aim here is to show the ability of a proper GOF to identify
problems with CMB power data fits, and to demonstrate
the advantage of the two proposed GOF statistics over the
naive and inappropriate classic $\chi^2$.

Fig. 3. Individual Vi (red squares) and $\chi^2$ (blue triangles)
for a subset of the data plotted in Figure 1.

4.3.1. Generalized $\chi^2$

For each band-power $i$, consider the variables $\alpha_i$ defined
as follows:

$$\int_{-\infty}^{\alpha_i} \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)\, dx = p_i \qquad (6)$$

where $p_i = \int_{-\infty}^{[\delta T_i]^2} \mathcal{P}_i(\delta T^2 | \Theta)\, d\delta T^2$ is calculated using Eq.
(5). The $\alpha_i$ are thus Gaussian random variables with zero
mean and a variance of unity. Hence, the sum
$gof = \sum_{i=1}^{N_{band}} \alpha_i^2$ follows a $\chi^2$ distribution with $N_{band}$ DOF and
provides a handy GOF statistic.
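
The mapping of Eq. (6) can be sketched in Python as follows, assuming that $p_i$ is the cumulative $\chi^2$ probability of the per-band variable $Y_i$ with $\nu_i$ DOF. The series used for the cumulative distribution, the clamping of $p_i$ and the input values are choices made for this sketch only.

```python
import math
from statistics import NormalDist

def chi2_cdf(y, nu):
    """Chi-square CDF via the series for the regularised lower incomplete gamma function."""
    if y <= 0.0:
        return 0.0
    a, x = nu / 2.0, y / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= x / (a + n)
        total += term
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def alpha(y, nu):
    """Gaussian-equivalent deviate of Eq. (6): the standard normal CDF at alpha equals p."""
    p = chi2_cdf(y, nu)
    p = min(max(p, 1e-15), 1.0 - 1e-15)      # keep inv_cdf away from 0 and 1
    return NormalDist().inv_cdf(p)

def generalized_chi2(ys, nus):
    """gof = sum of alpha_i**2 over bands."""
    return sum(alpha(y, nu) ** 2 for y, nu in zip(ys, nus))

# Hypothetical per-band values of Y_i and effective DOF nu_i.
ys = [2.0 * math.log(2.0), 9.0, 30.0]
nus = [2.0, 6.0, 12.0]
gof = generalized_chi2(ys, nus)
```

The first band sits at the median of its distribution and therefore contributes essentially nothing to the sum, while the third lies in the upper tail and dominates it.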

4.3.2. Characteristic functions 

Another way to define a GOF statistic for a fit to $N_{band}$
power points relies on the following property of characteristic
functions: the characteristic function for the sum of
independent random variables is given by the product of
the individual characteristic functions. Given the $N_{band}$
random variables $Y_i$ and their probability distributions $\mathcal{P}_i$
(Eq. 5), we calculate the distribution of the random variable
$z = \sum_i Y_i$, which will represent the goodness of fit,
as follows: for each $Y_i$ we can compute the corresponding
characteristic function $\Phi_i(k)$. Then using the property
cited above, we can construct the characteristic function
$\Phi_z(k)$ of the variable $z$ by $\Phi_z(k) = \prod_{i=1}^{N_{band}} \Phi_i(k)$.
The probability distribution function of $z$, $\mathcal{P}(z)$, is then
given by the inverse Fourier transform of $\Phi_z(k)$. This approach
is particularly straightforward in our case because
the probability function given in Eq. (5) is just a $\chi^2$ law
with $\nu_i$ DOF, whose characteristic function $\Phi_i$ has an
analytic form. Multiplication of the individual characteristic
functions thus gives an analytical expression whose inverse
Fourier transform is itself a $\chi^2$ distribution in $z$, with
$\nu = \sum_i \nu_i$ DOF:

$$\mathcal{P}(z) \propto z^{\nu/2 - 1}\, e^{-z/2} \qquad (7)$$

with $z = \sum_i Y_i$ and $\nu = \sum_i \nu_i$.

The variable gof = z is thus (another) $\chi^2$-distributed
quantity that provides a useful GOF statistic.
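
For reference, the step from the individual characteristic functions to Eq. (7) can be written out explicitly. The sketch below assumes the convention $\Phi(k) = \langle e^{ikY} \rangle$ and independent $Y_i$, and uses the standard characteristic function of a $\chi^2$ law:

```latex
\Phi_i(k) = \left\langle e^{ikY_i} \right\rangle = (1 - 2ik)^{-\nu_i/2},
\qquad
\Phi_z(k) = \prod_{i=1}^{N_{band}} (1 - 2ik)^{-\nu_i/2}
          = (1 - 2ik)^{-\frac{1}{2}\sum_i \nu_i}
          = (1 - 2ik)^{-\nu/2},
\qquad
\mathcal{P}(z) = \frac{z^{\nu/2 - 1}\, e^{-z/2}}{2^{\nu/2}\, \Gamma(\nu/2)} .
```

The product has the same functional form as each factor, with the DOF simply added, which is why no numerical Fourier inversion is required in this case.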

5. "The good, the bad and the GOF" or Are 
CMB fluctuations consistent with a Gaussian 
distribution? 

5.1. Application 

In this section we apply each of the above GOF statistics
to the CMB data set shown in Figure 1; note that this does
not include the most recent BOOMERanG, MAXIMA
and DASI results. Adding these new data will essentially
result in reducing the "$\chi^2$"-distributed gof values without
drastically changing the results presented in this section.
Our overall approach is as described in Le Dour et al.
2000 (hereafter paper 3) and Douspis et al. 2001a, where
we used the likelihood approximation given in paper 1 to
find the best model. We consider three combinations of
data: Data set 1 contains all points (ALL) 5; set 2 consists
of all the data minus the Python 5 results (Coble et al.
1999) (ALL-5); and set 3 combines just COBE (Tegmark
& Hamilton 1997), MAXIMA (Hanany et al. 2000) and
BOOMERanG (de Bernardis et al. 2000) (CMB). The
best model for each data set will be referred to as BM_ALL,
BM_ALL-5 and BM_CMB. A summary of the various GOF statistics
for these models is given in Table 1; the lines are labelled
by GC for "generalised $\chi^2$", CF for "characteristic
functions", and "$\chi^2$" for the classic $\chi^2$ of Eq. (3) 6.

5 Actually, we noticed that our approximation fails to recover
the MC simulations for upper limits. For this reason we do not
include them in our analysis.

6 We noticed that $\sigma_n$ is given different definitions in the literature.
When considering the evaluation of the GOF using

Fig. 4. Results of the GOF for each subset given by our
CF test. The values of the GOF with the characteristic
function technique are given by the blue shaded part for
the ALL subset and the red arrow for the ALL-5 subset.





            ALL        ALL-5      CMB
GC          0.02%      8.6%       60.0%
CF          0.3%       51.0%      63.0%
$\chi^2$    0.004%     1.1%       55.0%

Table 1. Values of the GOF of the "best models" for
each subset of the actual data. GC is for "generalised $\chi^2$",
CF for the "characteristic functions" technique and $\chi^2$ for
the classic $\chi^2$.



For the GC technique, the gof is directly equivalent to the
absolute value of a $\chi^2$. To convert this into a percentage, we
need to know the DOF. The latter is given in our case by
the number of experiments taken into account in each set
minus the number of free cosmological parameters.

For the CF technique, the percentage given in Table
1 is obtained by integrating the probability distribution
function $\mathcal{P}(z)$ of $z = \sum_i Y_i$ from $z_{obs} = \sum_{i=1}^{N_{set}} Y_i^{obs}$ to infinity, where
$N_{set}$ is the number of experiments in each set.
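
The conversion of a $\chi^2$-distributed gof value into a percentage can be sketched in Python by integrating the density numerically from the observed value upward. The integration range, the step count and the example values are assumptions of this sketch; the values in Table 1 are not reproduced here because the underlying gof values and DOF are not quoted.

```python
import math

def chi2_pdf(z, dof):
    """Normalised chi-square density with dof degrees of freedom."""
    if z <= 0.0:
        return 0.0
    log_pdf = (dof / 2 - 1) * math.log(z) - z / 2 - (dof / 2) * math.log(2) - math.lgamma(dof / 2)
    return math.exp(log_pdf)

def significance_percent(z_obs, dof, n_steps=20000):
    """Integrate the chi-square density from z_obs upward (trapezoidal rule), in percent."""
    z_max = z_obs + 60.0 + 10.0 * dof          # far enough into the tail to neglect the remainder
    h = (z_max - z_obs) / n_steps
    total = 0.5 * (chi2_pdf(z_obs, dof) + chi2_pdf(z_max, dof))
    total += sum(chi2_pdf(z_obs + i * h, dof) for i in range(1, n_steps))
    return 100.0 * h * total

# Hypothetical example: gof = 6 with 4 DOF; the closed form gives 100 * exp(-3) * (1 + 3).
percent = significance_percent(6.0, 4)
```

The same function applies to both techniques once the appropriate DOF is supplied: the number of experiments minus the number of free parameters for GC, and the sum of the $\nu_i$ for CF.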

Figure 4 summarises the results given by our CF
test on both data sets 1 and 2. The line gives the function
to integrate and the shaded part is the integrated part
corresponding to the numbers given in Table 1. The solid
(blue) line and shaded part correspond to data set 1, and
the dashed (red) line and arrow to data set 2.



each definition, we found that the value of the GOF is quite
sensitive to the definition of $\sigma_n$. We consider in this paper the
technique giving the best value of the GOF.






M. Douspis 1 ' 2 , J.G. Bartlett 1 ' 3 , A. Blanchard 1 : Goodness of fit in CMB 



5.2. Discussion 

The first remark to be made based on Table 1 is that the
complete data set (set 1) is inconsistent with Gaussian
sky fluctuations, according to all three techniques; the GC
method, for example, excludes this hypothesis at more
than 99.99 %. This means in particular that it is not appropriate
to search for cosmological constraints, because the
whole class of models considered is ruled out. This could
be due to several effects, in particular the fact that we do
not include calibration uncertainties in our analysis.

The situation is different if we remove Python 5 (set
2) from the analysis. In this case, our two evaluations of
the GOF (GC and CF) both accept the hypothesis of
Gaussian sky fluctuations. In contrast, the classic (but
inappropriate) $\chi^2$ statistic marginally excludes this hypothesis.
Figure 3 illustrates the difference between our
GC method and the classic $\chi^2$, data point by data point
(for a subset of data set 1). Triangles show individual $\chi^2$
values, while boxes correspond to the Vi defined in section
3. We see that the classic $\chi^2$ overpenalizes the fit for
outliers, a conclusion already noted in paper 2.

Finally, we can see that all three methods accept
the Gaussian hypothesis as a good representation of the
COBE, MAXIMA and BOOMERanG data (set 3).

6. Conclusion 

We have discussed three different ways of estimating the
GOF of fits to CMB band-powers. A GOF statistic is a key element
of any parameter estimation study, and a good fit
must be ensured before considering parameter constraints.
The classic $\chi^2$ GOF statistic is not rigorously applicable
to power spectrum data, because power estimates are not
Gaussian distributed quantities. We propose instead two
alternative GOF statistics based on an approximation to
the distribution of power estimators. This approximation
was motivated by the same kind of arguments presented
in paper 1 for the likelihood function. The distribution of
a power estimator is a different quantity from the likelihood
function used to define the estimator. We tested
the approximation presented here against Monte Carlo
simulations of CMB observations and found that it reproduced
the distribution of the maximum likelihood
band-power estimator well.

We then constructed two different GOF statistics, 
whose distributions were found using the approximate 
power estimator distribution. With the same, rather minimal
information required to build the likelihood approximation
(paper 1), we are now also able to develop a GOF
statistic to test the quality of the fit of the maximum likelihood
model to a set of band-power data, thereby allowing a
complete statistical analysis of anisotropy data from diverse
observations. The method is limited by the fact that
we are unable to account for correlations between band- 
powers; this, however, is not a serious restriction, as these 
correlations are usually rather unimportant for the final 
results based on current data sets. 



In applying this approach to the set of band-power data
of Figure 1, we found that the "best model" obtained is in
fact a bad fit. In other words, the data are unlikely to have
been drawn from a Gaussian distribution represented by
such a model. The fit becomes acceptable if we exclude
the Python 5 points from the analysis, according to our
GOF statistics. This is most likely due to the fact that
we do not account for calibration errors, and so the bad
fit probably just indicates that the adopted calibration is
incorrect. It is interesting to note that, even with Python
5 removed, the classic $\chi^2$ still marginally rejects the best
fit. We traced this behaviour to the fact that this method
overweights the importance of "outliers".

The important cosmological conclusion is that this 
CMB data set (excluding Python 5, due to our inabil- 
ity to account for calibration errors) is consistent with 
Gaussian sky fluctuations drawn from the best-fit infla- 
tionary model. 

A final remark concerns the possibility offered by the
development of an approximate distribution function of
the estimators. In the application of current Monte Carlo
methods for $C_\ell$ extraction (e.g. Szapudi et al. 2000,
MASTER: Hivon et al. 2001), the estimator distribution
is a natural output. The likelihood function needed in
parameter estimation is however unknown. The present
study suggests that we could reconstruct the likelihood
function directly from the estimator distribution. The two
parameters ($\nu$ and $\beta$) can be fitted to the estimator distribution
and then used in the approximate likelihood
function of Bartlett et al. 2001. Consequently one is then
able to reconstruct the entire likelihood function and to perform
a proper parameter estimation.
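
One simple way to carry out such a fit is a method-of-moments estimate, sketched below in Python. This is an illustrative choice made here, not a procedure described in the paper: it assumes that the estimates follow the ansatz of Eq. (5), so that their skewness is $\sqrt{8/\nu}$ and their variance is $2(\langle\delta T^2\rangle + \beta^2)^2/\nu$. The test sample is drawn from the simple picture with hypothetical values.

```python
import random

def fit_nu_beta(estimates):
    """Method-of-moments fit of (nu, beta) to a sample of band-power estimates."""
    n = len(estimates)
    mean = sum(estimates) / n
    var = sum((x - mean) ** 2 for x in estimates) / n
    skew = sum((x - mean) ** 3 for x in estimates) / n / var ** 1.5
    nu = 8.0 / skew ** 2                        # chi-square skewness is sqrt(8 / nu)
    beta2 = (nu * var / 2.0) ** 0.5 - mean      # from var = 2 * (mean + beta^2)^2 / nu
    return nu, max(beta2, 0.0) ** 0.5

# Hypothetical test sample: simple-picture estimates with nu = 21, beta = 30, model 57.3.
rng = random.Random(2)
n_pix, beta_true, dt_model = 21, 30.0, 57.3
sigma_tot = (dt_model ** 2 + beta_true ** 2) ** 0.5
sample = [
    sum(rng.gauss(0.0, sigma_tot) ** 2 for _ in range(n_pix)) / n_pix - beta_true ** 2
    for _ in range(40000)
]
nu_fit, beta_fit = fit_nu_beta(sample)
```

Because the skewness is a noisy statistic, a fit to the full histogram would presumably be more stable in practice; the sketch only shows that the two parameters are in principle recoverable from the estimator distribution.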

Acknowledgements. M. D. would like to thank Nabila Aghanim
for useful comments and corrections.